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Least square fitting with one parameter less 
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It is shown that whenever the multiplicative normalization of a fitting function is not known, least 
square fitting by y 2 minimization can be performed with one parameter less than usual by converting 
the normalization parameter into a function of the remaining parameters and the data. 


Erratum: Jochen Heitger and Johannes Voss of 
Miinster University found a typo in two subroutines 
which are used for the 4-parameter fit example given in 
subsection III D of this paper. In each of the subroutines, 

subg_power2n.f and suby_power2n.f, 

a(2) has to be replaced by a(l) in the expression for the 
derivative dyda (1). Elimination of this error improves 
the performance of the fitting method considerable. 

In the following subsection HID is re-written to re¬ 
flect use of the correct subroutines and the error prone 
routines have been replaced in the package FITMl.tgz, 
which is still available on the Web as described. Other 
parts of the paper are not affected. 

I. INTRODUCTION 


In section II explicit equations for co(ai,..., a n _i) and 
its derivatives with respect to the parameters are derived. 
Practical examples based on Levenberg-Marquardt fit¬ 
ting [1,2] are given in section III. The conclusion from 
section III is that we have an additional useful approach, 
which mostly converges faster than the corresponding 
fit with the full number of parameters. Conclusion and 
an outlook on an eventual application are given in sec¬ 
tion IV. All examples of this paper can be reproduced 
with Fortran code that is provided on the Web and doc¬ 
umented in appendix A. In any case, the code provided 
here should be useful. 


II. CALCULATION OF THE NORMALIZATION 
CONSTANT AND ITS DERIVATIVES 


The general situation of fitting by y 2 minimization is 
that to data points yt = y(xi) with error bars A yi and a 
function y{x\ a,j) with j = 1,..., n parameters are given 
and we want to minimize 
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with respect to the n parameters, where we neglect (as 
usual) fluctuations of the Aj/j error bars. 

In many practical application one of the n parameters, 
say a„, is the multiplicative normalization of the function 
y{x\aj) so that it can be written as 

y(x\a,j) =a n f(x-,ai,...,a n - 1 ). (2) 

With the parameters a i,..., a„_i fixed there is a unique 
analytical solution cq for c = a n , which minimizes y 2 (c), 
so that the function y(x;a,j) depends effectively only on 
n — 1 parameters: 

y(x; or,..., a n -i) = (3) 

c 0 (ai,..., a„_i) f(x; ax,..., a„-i). 


This can be exploited to perform least square fitting with 
one parameter less. Remarkably, the number of steps re¬ 
quired for one calculation of y 2 remains linear in the 
number of data. Although the derivation of these result 
is rather straightforward, I have not encountered them in 
the literature, though I found myself frequently in situa¬ 
tions of performing least square fits of the type to which 
this method can be applied. 


We find the normalization constant co(ai,..., a n _i) 
by minimizing y 2 (c), c = a n for fixed parameters 
ax,. ■■ ,a n - 1 - We define 
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so that the derivative with respect to c is zero at the 
minimum y 2 (co): 
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This implies for cq the solution 
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As it should, this equation reduces for just one data point 
(to = 1) to c 0 = yx/f(xx). 

For fixed parameters ax, ■ ■ ■, a n -i the error bar Aco of 
cq follows from the variances of the data points: 
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where | a indicates that the parameters ax,..., a n - 1 are 
kept fixed. When all error bars of the data agree, i.e., 
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A yi = Ay for i = 1,..., m holds, equations (6) and (8) 
simplify to 
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If, in addition, the function f(x) is a constant, f{xi) = /o 
for i = 1 ,m, we find the usual reduction of the vari¬ 
ance through sampling: (Aco) 2 | a = (A y/f 0 ) 2 /m. Note 
that the error bar Eq. (8) does not hold when the pa¬ 
rameters ai,..., a n -i are allowed to fluctuate, i.e., have 
themselves statistical errors Aoj. Then the propagation 
of these errors into Co(ai,..., a„_i) has to be taken into 
account, which is done below. Eq. (8) is mainly of rel¬ 
evance for the n = 1 case when cq eliminates the sole 
parameter a\. 

In our illustrations based on the Levenberg-Marquardt 
approach as well as for many other fitting methods one 
needs the derivatives of the fitting function with respect 
to the parameters. With Eq. (3) this become 
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Using the derivatives (12) the full variance of co with the 
associated error bar Acq = \ J (Acq ) 2 becomes 
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where Cj k is the covariance matrix of the parameters 
ai,... ,a n - 1 , which is in our examples returned by the 
Levenberg-Marquardt fitting procedure.. 


III. EXAMPLES 

In this section we summarize results from least square 
fits with n = 1 to n = 4 parameters cq, i = 1 ,n, 
where the last one, a n , is always taken to be a multi¬ 
plicative normalization. The corresponding Fortran code 
is explained in appendix A. 

We apply the Levenberg-Marquardt method in each 
case to all n parameters as well as to n —1 parameters by 


considering co, the least square minimum of a n given by 
(6), to be part of the function. The Levenberg-Marquardt 
method uses steepest decent far from the minimum and 
switches to the Hessian approximation when the mini¬ 
mum is approached. Our Fortran implementation is a 
variant of the one of Ref. [3]. Besides the fitting function 
y(x]CLi ) one has to provide the derivatives 
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and start values for the n, respectively n—1, parameters 
a-i. Usually the method will converge to the nearest local 
minimum of y , i.e., the minimum which has the initial 
parameters in its valley of attraction. 

Our choice of data for which we illustrate the method 
is rather arbitrary and emerged from considerations of 
convenience. For the n = 1 to n = 3 parameter fits 
we use cleconfining temperatures estimates from Markov 
Chain Monte Carlo (MCMC) simulations of 4D SU(2) 
gauge theory on NfN T lattices as reported in Table 4 of 
a paper by Lucini et al. [4]. We aim to extract from them 
corrections to asymptotic scaling by fitting (aT c ) _1 = 
A t (/ 3 c ) to the form [5] 
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where N T is the temporal extension of a N^N T lattice and 
/“(/3, N) is the universal two-loop asymptotic scaling 
function of SU(N) gauge theory 

K'(J3,N) = e -V(2W) (bog 2 )- b ^ 2b2 °\ g 2 = ™ (19) 

with b 0 = (A/167T 2 ) (11/3) the one-loop [6,7] and b\ = 
(A r /167T 2 ) 2 (34/3) the two-loop result [8,9]. 

In Table 4 of [4] the error bars are for the critical cou¬ 
pling constants /3 C . For the purpose of the fit (18) the 
error bars are shuffled to N(/3 C ) by means of the equa¬ 
tion 

A (aTe)" 1 = - \fx((3c) ~ fx(J3c + A&)] , (20) 

/a(Pc) 

where a preliminary estimate of the f\((3) scaling func¬ 
tion (17) is used. The thus obtained data (omitting 
N t < 4 lattices) are compiled in Table I. 

For the n = 4 parameter fits results from Bhanot et 
al. [10] for the imaginary part Im(u) of the partition func¬ 
tion zero closest to the real axis are used, which are ob¬ 
tained from MCMC simulation of the 3D Ising model on 
Nf lattices. These data are also collected in our Table I. 
To leading order their finite size behavior is 

y = Im(tt) = <22 x ai with x = N s , (21) 
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TABLE I. Data used for our least square fitting examples. 


Lucini et al. [4] 

Bhanot et al. [10] | 

Pc 

JVr(/?c) 

N s 

Im(-u) 

2.29860 

4.0000 (77) 

4 

0.087739 (5) 

2.37136 

5.0000 (86) 

5 

0.060978 (5) 

2.42710 

6.0000 (32) 

6 

0.045411 (5) 

2.50900 

8.0000 (32) 

8 

0.028596 (5) 

- 

- 

10 

0.019996 (5) 


where a\ is related to the critical exponent v of the cor¬ 
relation length by ai = —\jv. In the context of our 
method this 2 -parameter fit is of no interest, because it 
can be mapped onto linear regression for which the y 2 
minimum leads to analytical solutions for both parame¬ 
ters, ai and a ,2 ■ For the critical exponent v this yields 
1/v = 1.6185(2), but has an unacceptable large y 2 , which 
leads to Q = 0 for the goodness of fit (details are given 
in [3]). 

One is therefore led to including subleading corrections 
by moving to the 4-parameter fit 

y = Im(u) = 04 x ai (1 + a2x“ 3 ) = 04/(2;) (22) 

for which our method replaces 04 by cq ( 6 ) as function of 
the other parameters. We are now ready to present the 
results for our fits. 


A. 1-parameter fits 

The function (17) reduces to the from 

V = (*» 

and there are no fitting parameters left when the ana¬ 
lytical solution co ( 6 ) with the error bar Aco form ( 8 ) is 
used for 04 . 

Our Levenberg-Marquardt procedure works down to a 
single fit parameter and uses besides the fitting function 
(23) the only derivative 


da 1 /r 09) ‘ V ' 

Using the start value ai = 0.0628450 one finds conver¬ 
gence after three iteration with the results 

ai = 0.025336 (26) and y 2 = 4263. (25) 

Without any iteration identical values for ai and y 2 are 
obtained by using the analytical solution co and its error 
bar ( 8 ). Note that y 2 has to be the same for identical 
parameters. So Co still counts when it comes to counting 
the degrees of freedom. 

Obviously the obtained y 2 is unacceptable large and it 
is well visible from the 1-par curve in Fig. 1 that this fit is 
not good. Additional parameters are needed to account 
for corrections to asymptotic scaling. 



FIG. 1. Uniform distribution: Impossible (left) and miss¬ 
ing events (in the enlarged thickness of the right border). 


B. 2-parameter fits 

The function (17) is now reduced to the form 

y=NAm = 7m( 1+ i)' (26) 

For the fit with two parameters we use the start values 
0,2 = 0.0628450 (as for a\ before) and 04 = —1.43424. 
After six iterations we find convergence and the results 

ai = -1.5751 (84), o 2 = 0.07460 (77), y 2 = 117. (27) 

Running the fitting routine now just for the parameter 
ai by eliminating 02 in the described way, we find the 
results (27) after four iterations. In particular, also the 
error bar of Co, now calculated via Eq. (15), agrees with 
the error bar of 02 . 

Clearly the y 2 is still too large to claim consistency 
between the fit and the data though the visible improve¬ 
ment is considerable. See the 2-par curve in Fig. 1. 

C. 3-parameter fits 

We fit now to the full functional form (17). For the 
three parameter fit the previous starting values are re¬ 
shuffled <12 —> < 23 , On -A 0,2 and the additional starting 
value is taken to be ai = 1. Our Levenberg-Marquardt 
procedure needs 245 iterations for convergence and yields 
the values 

oi = 4.685 (83), o 2 = -4.199 (46), (28) 

03 = 0.395 (30), y 2 = 0.156, (29) 

which are rather different than the corresponding results 
03 -A 02 and 02 -A a\ (27) of the 2-parameter fit. 
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Identical values (28), (29) are obtained after eliminat¬ 
ing the normalization, here 03 , from the fit parameters 
used in the Levenberg-Marquardt iteration and we find 
a reduction of iterations from 245 to 12. 

The x 2 is now small enough to signal consistency be¬ 
tween the data and fit, although the 2-par and 3-par 
curves in Fig. 1 are hard to distinguish visually. Con¬ 
verting the x 2 °f (29) into a goodness of fit one finds 
Q = 0.69. 

D. 4-parameter fits 

The aim is to perform the 4-paranreter fit (22) for the 
data of Bhanot et al. (Table I). Using all four parameters, 
our first set of starting values, 

a\ = — 1 . 6 , a .2 = 0 . 1 , 03 = — 1 . 0 , 04 = 0 . 8 , 

is chosen so that Oi stays close to the 2 -parameter fit 
result —ai = 1/v = 1.6185(2) from [3]. Running our 
Levenberg-Marquardt fit program on these initial values, 
it needs 391 iterations to converge and finds 

oi =-1.5981(31), o 2 = 0.77(39), (30) 

o 3 = -2.80(52), 04 = 0.7917(61), x 2 = 0.113, (31) 

where the x 2 can be converted into a goodness of fit Q = 
0.74. 

Eliminating the normalization 04 from the direct fit¬ 
ting parameters, convergence is reached after 58 itera¬ 
tions and we find identical estimates as before: 

oi =-1.5981(31), o 2 = 0.77(39), (32) 

o 3 = -2.80(52), c 0 = 0.7917(61), x 2 = 0-H3, (33) 

Noticeably, our example features a rugged landscape for 
X 2 as function of the parameters. With different starting 
values 

Oi = —4.4, 02 = 1.3 , 03 = 2.8 , 04 = 0.6 

a minimum entirely different from the one above was 
found by accident. Running our Levenberg-Marquardt 
procedure on these starting values convergence is reached 
after 9 iterations and gives 

oi = -4.40 (53), o 2 = 1.31 ( 66 ), (34) 

03 = 2.80(52), o 4 = 0.61(31), x 2 = 0-H3- (35) 

Eliminating the normalization 04 we find convergence af¬ 
ter 8 iterations and the same results: 

oi = -4.40 (53), o 2 = 1.31 ( 66 ), (36) 

03 = 2.80(52), o 4 = 0.61(31), x 2 = 0.113. (37) 

Either set of parameters fits the data perfectly well 
in their range, while the different fits function diverge 
quickly out of this range, i.e., for larger lattices. 


IV. CONCLUSION AND OUTLOOK 

This paper shows that we can exclude the multiplica¬ 
tive normalization of a fitting function from the variable 
parameters of a x 2 minimization and include it into the 
fitting function (it still counts when it comes to determin¬ 
ing the degrees of freedom). Our simple examples show 
that this works well, reducing the number of iterations. 

The code discussed in appendix A shows that there is 
no extra work involved for the user. Once the general, 
application independent, code is set up, the subroutine 
which the user has to supply has one parameter less than 
the one needed for the usual Levenberg-Marquardt fitting 
procedure (compare the reduction from subg_la3su2. f 
to suby_la3su2. f). Besides, it is useful to have alterna¬ 
tives at hand when one is trying to find convenient initial 
values. 

Finally, there may be interesting applications, which 
cannot be easily incorporated into conventional fitting 
schemes. For instance, there are situations where dis¬ 
tinct data sets are supposed to be described by the same 
function with different multiplicative normalizations. An 
example is scale setting in lattice gauge theories [11]. The 
method of this paper can then be used to eliminate all 
multiplicative constants from the independent parame¬ 
ters of the fit so that one can consolidate all data sets 
into one fit. This application will be pursued elsewhere. 
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APPENDIX A: FORTRAN IMPLEMENTATION 

For a limited time the Fortran code can still be down¬ 
loaded as archive FITM1. tgz from the authors website 
at* 

http://www.hep.fsu.edu/~berg/research/research.html. 
After downloading the FITM1. tgz file, it unfolds under 

tar — zxvf FITMl.tgz 


*A previous version of this paper has been published in 
Computer Physics Communication 200 (2016) 254-258 and 
the program package is available from their program library. 
When using these programs, the typo in the two relevant sub¬ 
routines needs to be corrected to reproduce the results of sub¬ 
section HID. 
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into a tree structure with the folder FITM1 at its top. 
On the next level there are two subfolders examples and 
libs. To reproduce the results of section III go to the 
examples folder, where one finds the subfolders lpar, 
2par, 3par and 4par. 

Each of these subfolders contains two Fortran pro¬ 
grams, ngfit.f and ngfitml.f, where n = 1,2,3 or 
4 denotes the number of parameters. Both programs are 
ready to be compiled and run, say with . /a. out>a.txt. 
The thus produced results should agree with those found 
in the text file a n. txt and anml. txt. In addition graph¬ 
ical output is created. Type 

gnuplot gfit.plt 

to see the plots (a gnuplot driver gfit.plt is located in 
each folder). 

The programs read data and starting values from files 
named fort. 10. For the n = 4 case there are two 
sets, bhanotl. dat and bhanot2 . dat, which differ by the 
initial values and the desired one has to be copied on 
fort. 10 before the run. 

With exception of lgfitml.f, which is a special case 
discussed at the end of this appendix, the programs rely 
on our Levenberg-Marquardt subroutine fit sub to find 
the minimum of y. This subroutine, is transferred after 
the end statement of each main program by an include 
statement into the code, 

include /libs/fortran/fitsub.f ’ 

and in the same way this is done for all others routines 
needed. These include statements allow for easy track¬ 
ing of the location of the source code of each routine. 

The crucial difference between the two main pro¬ 
grams is: ngfit.f calls fitsub for nfit=n parame¬ 
ters, but ?jgfitml calls it for nfit=n—1 parameters. 
This is achieved by feeding distinct subroutines subg 
into f itsub, which define the fitting functions and their 
derivatives with respect to the nfit parameters. For in¬ 
stance, for the 3-parameter case the subroutine used by 
the run of 3gfit. f is subg_la3su2. f as listed here: 

subroutine subg(x,a,yfit,dyda,nfit) 
c BB May 20 2015. User provided subroutine 
c for a 3-parameter fit of the pure su2 scale, 
c yfit=a3*(l+a2/x+al/x**2)/flasu2(x,2) with x=beta 
c and flasu2(x,2) the asymptotic su2 scaling 
c function (yfit has the dimension of a length). 
include /libs/fortran/implicit.sta’ 

include ’../../libs/fortran/constants.par’ 
dimension a(nfit),dyda(nfit) 
if(nfit.ne.3) stop "subg_la3su2: nfit.ne.3." 
x2=x**2 

flasu2=fla(x,itwo) 

dyda(3)=(one+a(2)/x+a(l)/x2)/flasu2 

yfit=a(3)*dyda(3) 

dyda(2)=(a(3)/x)/flasu2 

dyda(l)=(a(3)/x2)/flasu2 

return 

end 


The 3gftiml.f program and all other ngfitml.f pro¬ 
grams, but n=l, include subgf itml .f with the following 
code: 

subroutine subg(xx,aa,yfit,dyda,nfit) 
c BB May 20 2015. Generic routine for least square 
c fit with one parameter less. Input needed: User 
c provided subroutine suby for unnormalized fit 
c function and its derivatives with respect to 
c its parameters. 

include ’../../libs/fortran/implicit.sta’ 
include ’../../libs/fortran/constants.par’ 
include ’../../libs/subs/common_fitting.f’ 
c For this routine the common transfers x,y,ye. 
dimension aa(nfit),dyda(nfit) 
call chi2dcda(ndat,nfit,x,y,ye,aa.dcda,cO) 
call suby(xx,aa,yy,dyda,nfit) 
yfit=c0*yy 
do i=l,nfit 

dyda(i)=c0*dyda(i)+dcda(i)*yy 
enddo 
return 
end 

Here the call to suby defines the user supplied func¬ 
tion and its derivatives as the subg routines do for the 
?7gfit.f programs. The suby routines replace the nor¬ 
malization parameters by the number one. For our exam¬ 
ple above subg_la2su2. f becomes suby_la2su2. f given 

by 

subroutine suby(x,a,y,dyda,nfit) 
c BB may 20 2015. User provided fit function 
c y=(l+a2/x+al/x**2)/flasu2(x,2) and derivatives, 
include ’../../libs/fortran/implicit.sta’ 
include ’../../libs/fortran/constants.par’ 
dimension a(nfit),dyda(nfit) 
if(nfit.ne.2) stop "suby_la3su2: nfit.ne.2." 
x2=x**2 

flasu2=fla(x,itwo) 

y=(one+a(2)/x+a(l)/x2)/flasu2 

dyda(2)=(one/x)/flasu2 

dyda(l)=(one/x2)/flasu2 

return 

end 

This it the only routine, which the user has to define 
for the reduced fitting procedure. The chi2dcda routine 
called in subgf itml. f is generic for all these fits and 
calculates Co (6) and its derivatives (12) as defined in 
section II with the following code: 

subroutine chi2dcda(n,nf,x,y,ye,a,dcda,cO) 
c BB May 20 2015. Determination of the normalization 
c constant cO for chi2 fit of data y_i with the fit 
c function c0*f. Then, in dcda the derivatives of 
c cO with respect to the fit parameters, 
c Input: data arrays x(n),y(n),ye(n), 
c parameter array a(nf). 

c Output: dcda(nf), cO. 
c variable fit parameters a(nf). 
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include ’../libs/fortran/implicit.sta’ 
include ’../libs/fortran/constants.par’ 
parameter(mf=30) ! maximum number parameters, 

dimension drda(mf),dsda(mf),dfda(mf),a(nf), 

& dcda(nf), x(n),y(n),ye(n) 
if(nf.gt.mf) stop "chi2dcda: nf.gt.mf." 
r=zero 
s=zero 
do j=l,nf 

drda(j)=zero 
dsda(j)=zero 
enddo 
do i=l,n 

call suby(x(i),a,f,dfda,nf) 

ye2i=one/ye (i)**2 ! l/(error bar squared). 

ri=f*ye2i 

r=r+y(i)*ri ! iterate r. 

s=s+f*ri ! iterate s. 

addy=y(i)*ye2i 
addf=two*ri 
do j=l,nf 

drda(j)=drda(j)+addy*dfda(j) 
dsda(j)=dsda(j)+addf*dfda(j) 
enddo 
enddo 
c0=r/s 
do j=l,nf 

dcda(j)=(s*drda(j)-r*dsda(j))/s**2 
enddo 
return 
end 

An extension of this routine, chi 2 c 0 eb. f, calculates also 
the variance (15) of cq and needs on input the covariance 
matrix of the parameters cq ,..., a n -\. 

Finally, the program lgfitml.f is a special case, be¬ 
cause the fit sub routine does not work for nfit =0 pa¬ 
rameters. Actually, it is not needed at all in this case 
and becomes replaced by a variant of the chi 2 dcda.f 
routine: chi 2 c 0 .f calculates Co (6) and its error bar, 
now according to Eq. ( 8 ) instead of (15). 
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